Data-driven detection and diagnosis of system-level failures in middleware-based service compositions

Author

  • Bruno Wassermann
Abstract

Service-oriented technologies have simplified the development of large, complex software systems that span administrative boundaries. Middleware that hides much of the underlying complexity enables developers to build applications as compositions of services. The resulting applications inhabit complex, multi-tier operating environments that pose many challenges to their reliable operation and often lead to failures at runtime. Two key components of the time to repair a failure are the time to detect it and the time to diagnose its cause. The prevalent approach to detection and diagnosis is primarily based on ad-hoc monitoring as well as operator experience and intuition. This is inefficient and leads to decreased availability. We propose an approach to data-driven detection and diagnosis in order to decrease the repair time of failures in middleware-based service compositions. Data-driven diagnosis supports system operators with information about the operation and structure of a service composition. We discuss how middleware-based service compositions can be monitored in a comprehensive, yet non-intrusive manner and present a process to discover system structure by processing deployment information that is commonly reified in such systems. We perform a controlled experiment that compares the performance of 22 participants using either a standard or the data-driven approach to diagnose several failures injected into a real-world service composition. We find that system operators using the latter approach are able to achieve significantly higher success rates and lower diagnosis times. Data-driven detection is based on the automation of failure detection through applying an outlier detection technique to multivariate monitoring data. We evaluate the effectiveness of one-class classification for this purpose and determine a simple approach to select subsets of metrics that afford highly accurate failure detection.
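To make the idea of one-class classification on multivariate monitoring data concrete, the following is a minimal sketch and not the method evaluated in the thesis. It assumes a very simple detector that learns per-metric means and standard deviations from failure-free samples only, then flags any sample whose normalized distance from that centroid exceeds a quantile of the training distances. The class name, the two metrics (CPU utilisation and response latency), and all numeric values are hypothetical.

```python
import math
import random
import statistics


class CentroidOneClassDetector:
    """Toy one-class classifier trained only on failure-free samples.

    Hypothetical illustration; the source does not specify its classifier.
    """

    def fit(self, normal_samples, quantile=0.99):
        # Learn per-metric mean and standard deviation from normal data.
        columns = list(zip(*normal_samples))
        self.means = [statistics.fmean(c) for c in columns]
        # Fall back to 1.0 for constant metrics to avoid division by zero.
        self.stds = [statistics.pstdev(c) or 1.0 for c in columns]
        # The decision threshold is a quantile of the training distances.
        distances = sorted(self._distance(s) for s in normal_samples)
        index = min(len(distances) - 1, int(quantile * len(distances)))
        self.threshold = distances[index]
        return self

    def _distance(self, sample):
        # Euclidean distance from the centroid in standardized metric space.
        return math.sqrt(
            sum(((x - m) / s) ** 2
                for x, m, s in zip(sample, self.means, self.stds))
        )

    def is_outlier(self, sample):
        return self._distance(sample) > self.threshold


# Synthetic failure-free data: (cpu_percent, latency_ms), both assumed metrics.
rng = random.Random(0)
normal = [(rng.gauss(40, 5), rng.gauss(120, 10)) for _ in range(500)]
detector = CentroidOneClassDetector().fit(normal)

print(detector.is_outlier((40.0, 120.0)))   # typical operating point
print(detector.is_outlier((95.0, 900.0)))   # saturated CPU, very high latency
```

Selecting a subset of metrics, as the abstract describes, would correspond here to choosing which columns are passed to `fit` and `is_outlier`.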

Related resources

Access control in ultra-large-scale systems using a data-centric middleware

  The primary characteristic of an Ultra-Large-Scale (ULS) system is ultra-large size on any related dimension. A ULS system is generally considered a system-of-systems with heterogeneous nodes and autonomous domains. As the size of a system-of-systems grows and the demand for interoperability between sub-systems increases, achieving a more scalable and dynamic access control system becomes an im...


Policy-Driven Middleware for Self-adaptation of Web Services Compositions

We present our policy-based middleware, called Manageable and Adaptive Service Compositions (MASC), for dynamic self-adaptation of Web services compositions to various changes. MASC integrates and extends our earlier middleware called the Web Services Message Bus (wsBus). In particular, we discuss MASC support for customization of Web services compositions to address business exceptions and wsB...


Analysis and Diagnosis of Partial Discharge of Power Capacitors Using Extension Neural Network Algorithm and Synchronous Detection Based Chaos Theory

Power capacitors are important equipment in power systems and are operated at high voltage levels and high temperatures for long periods. Over time, their insulation fracture rate increases, and partial discharge is the most important cause of such fractures. Fast and accurate methods for diagnosing partial discharge are therefore of great importance. Conventional me...


Pic Microcontroller-Based Automatic Meter Reading (AMR) System Using the Low Voltage (LV) Power Line Network (TECHNICAL NOTE)

Automatic Meter Reading (AMR) is the remote collection of consumption data from customer’s utility meters over telecommunications, radio, power line and other links. AMR provides water, electric and gas utility service companies the opportunity to streamline metering, billing and collection activities, increase operational efficiency and improve customer service. The AMR system consists of thre...


Robust Model-Based Fault Detection and Isolation for V47/660kW Wind Turbine

In this paper, a robust fault diagnosis system is proposed for the V47/660kW wind turbine operated in the Manjil wind farm, Gilan province, Iran, in order to increase efficiency, reduce cost, and prevent wind turbine failures that lead to extensive breakdowns. According to the acquired data from Iran wind turbine industry, common faults of the wind turbine such as sensor fault...



Publication date: 2012